9  Clinical Data Tables

9.1 Table name: PERSON

9.1.1 Reading from T_PERSON

NOTE: course_id is the visit reference and unique within departments only. Raw course_id’s are likely to recur across sites.

Destination Field Source Field Logic/remarks
person_id cpr_enc hash(cpr_enc) // 2
The current implementation uses duckdb’s hash function (implemented here; see also https://nullprogram.com/blog/2018/07/31). We halve because duckdb’s hash() returns a uint64 (unsigned big integer) but we want a normal int64 to make the final CDM compatible with e.g. PostgreSQL which doesn’t ship with uint64 support.
gender_concept_id c_kon ‘K’ ~ 8532
‘M’ ~ 8507
else drop person
year_of_birth d_foddato Extract year
month_of_birth d_foddato Extract month
day_of_birth d_foddato Extract day
birth_datetime d_foddato Set time to 00:00:00
race_concept_id Map to 0
ethnicity_concept_id Map to 0
location_id NULL
provider_id NULL
care_site_id NULL
person_source_value cpr_enc 'cpr_enc|<cpr_enc'>
gender_source_value c_kon 'c_kon|<c_kon>'
gender_source_concept_id NULL
race_source_value NULL
race_source_concept_id NULL
ethnicity_source_value NULL
ethnicity_source_concept_id NULL

9.2 Table name: DEATH

9.2.1 Reading from T_PERSON

Destination Field Source Field Logic/remarks
person_id PERSON.person_id Only for those patients in PERSON with c_status = 90
Look up the person_id in PERSON by matching the 'cpr_enc|<cpr_enc>' with PERSON.person_source_value
death_date d_status_hen_start When c_status == 90 [dead]. Format: YYYY-MM-DD
death_datetime NULL
death_type_concept_id 32879 Registry
cause_concept_id NULL
cause_source_value NULL
cause_source_concept_id NULL

9.3 Table name: VISIT_OCCURENCE

9.3.1 Reading from course_metadata, environment variable (SHAK_code) and SHAK code lookup file

COURSE_METADATA has a long format, effectively yielding a key-value data format so that a variable’s name is looked up in the variable column and the corresponding value is used.

Destination Field Source Field Logic/remarks
visit_occurrence_id hash(<shak_code>|<courseid>)
See PERSON.person_id for details on hashing
person_id PERSON.person_id
visit_concept_id SHAK_LOOKUP.department_type Use environment variable DEPARTMENT_SHAK_CODE to find the department in SHAK_LOOKUP. Then, use CONCEPT_LOOKUP.concept_id where CONCEPT_LOOKUP.concept_string == SHAK_LOOKUP.department_type and CONCEPT_LOOKUP.filter == 'care_site'
visit_start_date value When variable == 'admdate' use corresponding value. Otherwise, use admdatetime cast to date.
visit_start_datetime value When variable == 'admdatetime' use corresponding value. Otherwise, use admdate with 00:00:00 suffix
visit_end_date value When variable == 'dischdate' use corresponding value. Otherwise, use dischdtuse cast to date
visit_end_datetime value When variable == 'dischdtuse' use corresponding value. Otherwise, use dischdate with 00:00:00 suffix
visit_type_concept_id 32817 EHR
provider_id NULL
care_site_id CARE_SITE.care_site_id Join with CARE_SITE on CARE_SITE.care_site_source_value == 'department_shak_code|<DEPARTMENT_SHAK_CODE>'
visit_source_value course_id 'course_id|<course_id>'
visit_source_concept_id NULL
admitted_from_concept_id If key == transfromid and value IS NOT NULL, then look up value in CONCEPT_LOOKUP.
The source data contain the following values:
  • “Non-ICU dept. - this hospital”
  • “Other ICU – other hospital”
  • “Non-ICU dept. - other hospital”
  • “” (blank)
  • “Other ICU – this hospital”
  • “Emergency room”
admitted_from_source_value Use 'transfromid\|<value>'
discharged_to_concept_id If key == ‘chkouttoid’, look up value in CONCEPT_LOOKUP.
The source data contain the following values:
  • “Non-ICU dept. - this hospital”
  • “Other ICU – other hospital”
  • “Non-ICU dept. - other hospital”
  • “” (blank)
  • “Other ICU – this hospital”
  • “Home”
discharged_to_source_value Use 'chkouttoid|<value>'
preceding_visit_occurrence_id NULL

9.4 Table name: VISIT_DETAIL

Not in scope.

9.5 Table name: STEM

Most of the columns here come from CONCEPT_LOOKUP_STEM. When they do not, the origin table is denoted as prefix in the Source Field.

Destination Field Source Field Logic/remarks
domain_id std_code_domain
datasource Appropriate idenfier of the provenance of the data (e.g. file name)
stem_id Auto-generated integer
person_id PERSON.course_id PERSON.person_source_value == 'cpr_enc|<SOURCE_TABLE.cpr_enc>'
concept_id mapped_standard_code Joining source data with CONCEPT_LOOKUP_STEM depends on the type of source data. For details, please refer to the actual implementation here. Generally, the idea is that for categorcial values, we join on '<SOURCE_TABLE>.<variable>__<value>' == CONCEPT_LOOKUP_STEM.source_concept_code, while we join on '<SOURCE_TABLE>.<variable>' == CONCEPT_LOOKUP_STEM.source_concept_code for numerical and free-text values. The standard concept id’s of free-text values, then, are fetched from CONCEPT_LOOKUP. This way, free-text values serve as an extension of categorical values, when there are so many possible values that explicitly mapping them each via CONCEPT_LOOKUP_STEM would be too cumbersome.
start_date <start_date> Use column with the name defined in the source field, cast to DATE
start_datetime <start_date> Use column with the name defined in the source field, cast to TIMESTAMP after converting to Europe/Copenhagen timezone
end_date <end_date> Use column with the name defined in the source field, cast to DATE
end_datetime <end_date> Use column with the name defined in the source field, cast to TIMESTAMP after converting to Europe/Copenhagen timezone
type_concept_id type_concept_id
provider_id NULL
visit_occurrence_id VISIT_OCCURRENCE.visit_occurrence_id Join with VISIT OCCURRENCE on VISIT_OCCURRENCE.visit_source_value == 'courseid|<SOURCE_TABLE>.courseid'
visit_detail_id NULL
care_site_id NULL
source_value The exact source_value depends on the provenance of the data, but generally consists of the variable name and the value separated by two underscores.
source_concept_id Will be CONCEPT_LOOKUP_STEM.uid for everything except drug administrations with an ATC code. For these, the source_concept_id will either be for the ATC concept (for drugs with custom mappings, in CONCEPT_LOOKUP_STEM) or for the ingredients (the rest, called automapped)
quantity_or_value_as_number Used in all target tables that have a quantity of value_as_number column. Combined into one in the stem table for efficient routing, see ?sec-stem-table-mapping.
value_as_string value_as_string
value_as_concept_id value_as_concept_id Used for value_type == 'categorical' to encode the different levels of the categorial variable
unit_concept_id unit_concept_id
value_source_value value
unit_source_concept_id NULL
unit_source_value unit_source_value
verbatim_end_date NULL
days_supply NULL
dose_unit_source_value dose_unit_source_value
modifier_concept_id modifier_concept_id
modifier_source_value NULL
measurement_datetime NULL
operator_concept_id operator_concept_id
range_low Coalesce of the lower bound as per the source data, when available, and CONCEPT_LOOKUP_STEM.range_low.
range_high Coalesce of the upper bound as per the source data, when available, and CONCEPT_LOOKUP_STEM.range_high.
stop_reason NULL
refills NULL
sig NULL
route_concept_id Join with CONCEPT_LOOKUP on CONCEPT_LOOKUP.concept_string == PRESCRIPTIONS.epaspresadmroute and use CONCEPT_LOOKUP.concept_id
route_source_value PRESCRIPTIONS.epaspresadmroute
era_lookback_interval era_lookback_interval
lot_number NULL
unique_device_id NULL
production_id NULL
anatomic_site_concept_id NULL
disease_status_concept_id NULL
specimen_source_id NULL
anatomic_site_source_value NULL
disease_status_source_value NULL
condition_status_concept_id NULL
condition_status_source_value NULL
qualifier_concept_id NULL
qualifier_source_value NULL
event_id NULL
event_field_concept_id NULL
episode_id_source NULL

9.6 Table name: CONDITION_OCCURRENCE

9.6.1 Reading from STEM (filtered on domain_id = ‘Condition’)

Destination Field Source Field Logic/remarks
condition_occurrence_id uid
person_id person_id
condition_concept_id concept_id If environment variable INCLUDE_UNMAPPED_CODES == 'FALSE' (default), we discard records whose concept_id’s are 0 or NULL
condition_start_date coalesce(start_date, end_date)
condition_start_datetime coalesce(start_datetime, start_date::timestamp, end_datetime, end_date::timestamp)
condition_end_date coalesce(end_date, start_date)
condition_end_datetime coalesce(end_datetime, end_date::timestamp, start_datetime, start_date::timestamp)
condition_type_concept_id type_concept_id
stop_reason NULL
provider_id NULL
visit_occurrence_id visit_occurrence_id
visit_detail_id NULL
condition_source_value source_value
condition_source_concept_id source_concept_id
condition_status_source_value NULL

9.7 Table name: PROCEDURE_OCCURRENCE

9.7.1 Reading from STEM (filtered on domain_id = ‘Procedure’)

Destination Field Source Field Logic/remarks
procedure_occurrence_id uid
person_id person_id
procedure_concept_id concept_id If environment variable INCLUDE_UNMAPPED_CODES == ‘FALSE’ (default), we discard records whose concept_id’s are 0 or NULL
procedure_date coalesce(start_date, end_date)
procedure_datetime coalesce(start_datetime, start_date::timestamp, end_datetime, end_date::timestamp)
procedure_end_date coalesce(end_date, start_date)
procedure_end_datetime coalesce(end_datetime, end_date::timestamp, start_datetime, start_date::timestamp)
procedure_type_concept_id type_concept_id
modifier_concept_id modifier_concept_id
quantity quantity
provider_id NULL
visit_occurrence_id visit_occurrence_id
visit_detail_id NULL
procedure_source_value source_value
procedure_source_concept_id source_concept_id
modifier_source_value modifier_source_value

9.8 Table name: DEVICE_EXPOSURE

9.8.1 Reading from STEM (filtered on domain_id = ‘Device’)

Destination Field Source Field Logic/remarks
device_exposure_id uid
person_id person_id
device_concept_id concept_id
device_exposure_start_date coalesce(start_date, end_date)
device_exposure_start_datetime coalesce(start_datetime, start_date::timestamp, end_datetime, end_date::timestamp)
device_exposure_end_date coalesce(end_date, start_date)
device_exposure_end_datetime coalesce(end_datetime, end_date::timestamp, start_datetime, start_date::timestamp)
device_type_concept_id type_concept_id
unique_device_id NULL
production_id NULL
quantity NULL
provider_id NULL
visit_occurrence_id visit_occurrence_id
visit_detail_id NULL
device_source_value source_value
device_source_concept_id source_concept_id
unit_concept_id unit_concept_id
unit_source_value unit_source_value
unit_source_concept_id NULL

9.9 Table name: MEASUREMENT

9.9.1 Reading from STEM (filtered on domain_id = ‘Measurement’)

Destination Field Source Field Logic/remarks
measurement_id uid
person_id person_id
measurement_concept_id concept_id
measurement_date coalesce(start_date, end_date)
measurement_datetime coalesce(start_datetime, start_date::timestamp, end_datetime, end_date::timestamp)
measurement_time NULL
measurement_type_concept_id type_concept_id
operator_concept_id operator_concept_id
value_as_number quantity_or_value_as_number
value_as_concept_id value_as_concept_id
unit_concept_id unit_concept_id
range_low range_low
range_high range_high
provider_id NULL
visit_occurrence_id visit_occurrence_id
visit_detail_id NULL
measurement_source_value source_value
measurement_source_concept_id source_concept_id
unit_source_value unit_source_value
unit_source_concept_id unit_source_concept_id
value_source_value value_source_value
measurement_event_id event_id
meas_event_field_concept_id event_field_concept_id

9.10 Table name: SPECIMEN

9.10.1 Reading from STEM (filtered on domain_id = ‘Specimen’)

Destination Field Source Field Logic/remarks
person_id person_id
specimen_id specimen_id
specimen_concept_id concept_id
specimen_type_concept_id type_concept_id
specimen_date coalesce(start_date, end_date)
specimen_datetime coalesce(start_datetime, start_date::timestamp, end_datetime, end_date::timestamp)
quantity quantity_or_value_as_number
unit_concept_id unit_concept_id
anatomic_site_concept_id anatomic_site_concept_id
disease_status_concept_id disease_status_concept_id
specimen_source_id source_concept_id
specimen_source_value source_value
unit_source_value unit_source_value
anatomic_site_source_value anatomic_site_source_value
disease_status_source_value disease_status_source_value

9.11 Table name: OBSERVATION

9.11.1 Reading from STEM (filtered on domain_id = ‘Observation’)

Destination Field Source Field Logic/remarks
observation_id uid
person_id person_id
observation_concept_id concept_id
observation_date coalesce(start_date, end_date)
observation_datetime coalesce(start_datetime, start_date::timestamp, end_datetime, end_date::timestamp)
observation_type_concept_id type_concept_id
value_as_number quantity_or_value_as_number
value_as_string value_as_string
value_as_concept_id value_as_concept_id
qualifier_concept_id NULL
unit_concept_id unit_concept_id
provider_id provider_id
visit_occurrence_id visit_occurrence_id
visit_detail_id NULL
observation_source_value source_value
observation_source_concept_id source_concept_id
unit_source_value unit_source_value
qualifier_source_value qualifier_source_value
value_source_value value_source_value
observation_event_id event_id
obs_event_field_concept_id event_field_concept_id

9.12 Table name: DRUG_EXPOSURE

9.12.1 Reading from STEM (filtered on domain_id = ‘Drug’)

Destination Field Source Field Logic/remarks
drug_exposure_id uid
person_id person_id
drug_concept_id concept_id
drug_exposure_start_date coalesce(start_date, end_date)
drug_exposure_start_datetime coalesce(start_datetime, start_date::timestamp, end_datetime, end_date::timestamp)
drug_exposure_end_date coalesce(end_date, start_date)
drug_exposure_end_datetime coalesce(end_datetime, end_date::timestamp, start_datetime, start_date::timestamp)
verbatim_end_date NULL
drug_type_concept_id type_concept_id
stop_reason NULL
refills
quantity quantity_of_value_as_number
days_supply NULL
sig NULL
route_concept_id route_concept_id
lot_number NULL
provider_id NULL
visit_occurrence_id visit_occurrence_id
visit_detail_id NULL
drug_source_value source_value
drug_source_concept_id source_concept_id
route_source_value route_source_value
dose_unit_source_value NULL

9.13 Table name: OBSERVATION_PERIOD

9.13.1 Reading from clinical tables (including visit_occ)

NOTE: min/max dates all established from dates across all filled in clinical tables

Destination Field Source Field Logic/remarks
observation_period_id Auto-generated integer
person_id person_id
observation_period_start_date MIN(EVENT [START] DATES)
observation_period_end_date MAX(EVENT [END] DATES)
period_type_concept_id 32817 EHR or 32879 Registry